Unfolding mechanism and the free energy landscape of a single stranded DNA i-motif

We present Molecular Dynamics simulations of a single stranded unprotonated DNA i-motif in explicit solvent. Our results indicate that the native structure in non-acidic solution at 300 K is unstable and vanishes completely on a time scale of up to 10 ns. Two unfolding mechanisms with decreasing connectivity between the initially interacting nucleobases can be identified, one of which is entropically more favorable. The entropic preference can be explained mainly by strong water ordering effects due to hydrogen bonds for several structures occurring along the pathways. Finally, free energy calculations identify the most stable configurations as distinct hairpin structures, in good agreement with experimental results.

INTRODUCTION



Non-Watson-Crick structures in DNA were first reported two decades ago. Since then, considerable effort has been devoted to investigating these conformations and their possible applications in detail [1-5]. Experiments led to the conclusion that these structures are the only known DNA configurations that involve systematic base intercalation [2]. Prominent representatives are the G-quadruplex structures and the i-motif [5]: the former is formed by guanine (G) rich sequences [3], while the latter is present in more cytosine (C) rich strands of DNA 0.

The stabilizing mechanism for these, at first glance fragile, structures is a proton-mediated cytosine binding between different strands or regions of the sequence, resulting in a stable C-CH+ pairing [1], 0, 0, [Hj. In an acidic environment this is achieved by hemi-protonated cytosines, which mimic an ordinary C-G binding as it is present in double helix DNA. Hence these structures occur only at slightly acidic to neutral conditions, at pH values from 4.8 to 7.0 [H, 0, E[. I-motifs show a remarkable stability [6] and have been found as tetrameric and dimeric complexes, although their existence has also been proven for single stranded DNA [2]. A sketch of the C-CH+ complex, in which the additional proton mediates a hydrogen bond between the nitrogens of the cytosine groups, and of the corresponding single stranded i-motif with its sequence is shown in Fig. 1.

Due to its biological appearance in centromeric and telomeric DNA, the distinct i-motif conformations have been discussed as a new class of possible biological targets for cancers and other diseases 0, H|. However, a detailed investigation of their function in the human cell is still missing. Despite this lack of knowledge, the application of this special configuration in modern biotechnology has grown enormously over the last years [4]. Since the i-motif becomes unstable at pH values larger than 7, a systematic decrease and increase of the proton concentration in the solution by changing the pH value results in a reversible folding and unfolding mechanism. It has been shown that this process occurs on a timescale of seconds.

Technological applications for this mechanism are given by molecular nanomachines 0, 0], switchable nanocontainers [10], pH sensors to detect the pH value inside living cells [11], building materials for logic gate devices [12] and sensors for distinguishing single walled and multi-walled carbon nanotube systems [13]. Recently it has been reported [14] that the grafting density massively influences the structure of an i-motif layer in nanodevices due to steric hindrance. These examples make clear that a detailed investigation of the unfolding pathway of the i-motif is of primary importance.

In this paper we present the results of Molecular Dynamics simulations of the unfolding mechanism of a maximally unstable single stranded DNA i-motif structure without hemi-protonated cytosines. Our results indicate a fast initial decay of the i-motif, leading on a timescale of up to 10 ns to hairpin structures, which dominate the unfolded regime in contrast to a fully extended strand. The numerical findings are validated by experimental Circular Dichroism (CD) spectropolarimetry data. A detailed investigation of the unfolding pathways identifies two main mechanisms which differ significantly in their entropic properties. We separate the contributions of the solvent and of the chain configurational entropy explicitly to determine their influence on the unfolding pathways. The experimental results can be explained by a temperature-dependent entropy for each conformation, which is significantly dominated by the number of hydrogen bonds with the solvent.

The paper is organized as follows. In the next two sections we present the numerical and experimental details. The results are presented in the fourth section. We conclude with a brief summary in the last section.

NUMERICAL DETAILS

We have performed our Molecular Dynamics simulations of the i-motif in explicit TIP3P solvent at 300 K with the GROMACS software package [15] and the ffAmber03 force field [16]. The single DNA strand consists of 22 nucleobases given by the sequence 5'-CCC-[TAA-CCC]3-T-3', where T, A and C denote thymine, adenine and cytosine. We modeled this structure, which is directly related to the sequence used in Q, by the PDB entry 1ELN [17]. The cubic simulation box with periodic boundary conditions has dimensions of (5.41 x 5.41 x 5.41) nm^3 and is filled with 5495 TIP3P water molecules. The negative charge of -22e of the backbone has been compensated by 22 sodium ions. We applied a Nose-Hoover thermostat to the system, and all bonds have been constrained by the LINCS algorithm [15]. Electrostatics have been calculated by the PME algorithm and the timestep was 2 fs. After energy minimization by a steepest descent method, an initial warm-up phase of 1 ns has been performed with the position of the DNA molecule restrained. For a detailed investigation of the unfolding mechanism we conducted five simulations at 300 K, each of 10 ns duration, to calculate the average values.

FIG. 1: C-CH+ pairing (bottom left), which is responsible for the formation of the DNA i-motif (top left), with the corresponding sequence (top right). The dashed lines represent the initially interacting nucleobases, where C, T and A denote cytosine, thymine and adenine. A coarse grained model of the i-motif is shown at the bottom right.

The calculation of the free energy landscape has been performed by the metadynamics method presented in Ref. [18]. The biased metadynamics simulations at 300 K have been conducted with the program plug-in PLUMED [20]. The Gaussian hills were deposited every 2 ps with a height of 0.25 kJ/mol and a width of 0.25 nm. The reaction coordinates for the biasing energy are the distance between nucleobases C1 and T22 and the distance between the center of mass of the combined nucleobases C1 and T22 and the nucleobase A11. The final free energy landscapes have been refined by histogram reweighting [21] of 15 biased simulations of 10 ns at 300 K by the method introduced in Ref. [19]. The eigenvector free energy landscape has been calculated by using a projection scheme [19].
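The hill parameters quoted above can be made concrete with a minimal Python sketch of a metadynamics bias on two collective variables. It assumes plain (not well-tempered) metadynamics and the same Gaussian width for both distances; the function names and the list of hill centers are illustrative and are not taken from the PLUMED input used for the simulations.

```python
import math

HILL_HEIGHT = 0.25         # kJ/mol, value quoted in the text
HILL_WIDTH = 0.25          # nm, value quoted in the text (assumed equal for both CVs)
DEPOSIT_INTERVAL_PS = 2.0  # ps between two hills, value quoted in the text


def bias_potential(s, centers, height=HILL_HEIGHT, width=HILL_WIDTH):
    """History-dependent bias at the CV point s = (s1, s2).

    The bias is the sum of all Gaussian hills deposited so far, each
    centered at a previously visited CV point.
    """
    total = 0.0
    for c in centers:
        d2 = (s[0] - c[0]) ** 2 + (s[1] - c[1]) ** 2
        total += height * math.exp(-d2 / (2.0 * width ** 2))
    return total


def hills_deposited(sim_time_ns, interval_ps=DEPOSIT_INTERVAL_PS):
    """Number of hills added during a run of the given length."""
    return int(sim_time_ns * 1000.0 / interval_ps)


# Hypothetical history: the walker visited the same CV point twice.
visited = [(1.0, 2.0), (1.0, 2.0)]
bias_at_minimum = bias_potential((1.0, 2.0), visited)
hills_per_run = hills_deposited(10.0)
```

In this picture a 10 ns run deposits a few thousand hills, and once the minima are filled the negative of the accumulated bias serves as an estimate of the free energy along the two distances.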

To simplify our results for the kinetic investigations, we combined every three consecutive nucleobases into one group, resulting in the sequence CYS1-TAA1-CYS2-TAA2-CYS3-TAA3-CYS4-T, where CYS and TAA denote CCC and TAA, respectively, as shown in Fig. 1.
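The grouping can be reproduced with a short Python sketch. The sequence is the one stated in the text; the helper names are illustrative only.

```python
GROUP_NAMES = {"CCC": "CYS", "TAA": "TAA"}  # group labels used in the text


def build_sequence():
    """5'-CCC-[TAA-CCC]3-T-3' written out base by base."""
    return "CCC" + "TAACCC" * 3 + "T"


def group_bases(seq):
    """Combine consecutive triplets into numbered groups; the final T stays alone."""
    counters = {}
    groups = []
    for start in range(0, len(seq), 3):
        triplet = seq[start:start + 3]
        if len(triplet) < 3:
            # trailing single thymine at the 3' end
            groups.append((triplet, triplet))
            continue
        name = GROUP_NAMES[triplet]
        counters[name] = counters.get(name, 0) + 1
        groups.append((name + str(counters[name]), triplet))
    return groups


sequence = build_sequence()
grouped = "-".join(label for label, _ in group_bases(sequence))
```

Here `sequence` contains the 22 bases and `grouped` reproduces the coarse-grained label string used for the kinetic analysis.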

The calculation of the thermodynamic properties has 
been performed by keeping the position of each structure 
restrained for 100 ps. 



EXPERIMENTAL DETAILS 

The oligodeoxynucleotide was purchased from Sangon Co., Ltd. The sequence was identical to the one used in the simulations and it was dissolved in a final buffer with 50 mM MES and 50 mM NaCl. The buffer had a pH value of 8.0 and the concentration of DNA was 10 μM. We used Circular Dichroism (CD) spectropolarimetry to investigate the structural behaviour. All CD measurements were recorded on a Jasco spectropolarimeter (J-810) equipped with a programmable temperature-control unit. The scanning wavelength ranged from 350 to 220 nm at two different temperatures, 298.15 K and 368.15 K.



RESULTS 
Kinetic properties 

We start this section by presenting the results of the unbiased simulations. We calculated the number of hydrogen bonds between the CYS1-CYS3, CYS2-CYS4 and TAA1-TAA3 groups (Fig. 1) as a function of time, shown in Fig. 2. It has been reported [3, 5] that the binding between these groups is essential for the stability of the i-motif. An initial decay of the hydrogen bonds between the cytosine groups can be observed up to 1 ns. This decay can be roughly approximated by an exponential function exp(-t/τ), where the time constant is fitted to τ ≈ 450 ps for the CYS2-CYS4 pair and τ ≈ 130 ps for the CYS1-CYS3 binding. The configurational snapshot at 600 ps shows a broadened i-motif. On longer timescales of 2-5 ns the 3' end slightly opens. After 7 ns a stable configuration can be identified, which is dominated by the open 3' end with the CYS4-T groups and by the 5' end interacting with the central loop region. The number of hydrogen bonds of the TAA1-TAA3 connection is nearly constant up to 10 ns. Thus the observed final structure can be best described as a hairpin configuration (iij. However, given the short decay time of the initial hydrogen bonds, the starting structure is evidently very unstable in the absence of hemi-protonated cytosines, in agreement with the results reported in Refs. QSSEElj.
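The exponential approximation of the initial decay can be illustrated with a minimal Python sketch. It fits ln n(t) linearly in t by least squares, which is one simple way to obtain τ and not necessarily the fitting procedure used for Fig. 2. The data below are synthetic, generated with τ = 450 ps, and are not the simulation data.

```python
import math


def fit_decay_time(times_ps, counts):
    """Fit n(t) = n0 * exp(-t / tau) by linear least squares on ln n(t).

    Returns (n0, tau). Points with n <= 0 are skipped because the
    logarithm is undefined there.
    """
    pts = [(t, math.log(n)) for t, n in zip(times_ps, counts) if n > 0]
    m = len(pts)
    mean_t = sum(t for t, _ in pts) / m
    mean_y = sum(y for _, y in pts) / m
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


# Synthetic hydrogen bond counts, sampled every 100 ps up to 1 ns.
times = [100.0 * k for k in range(11)]
synthetic_counts = [6.0 * math.exp(-t / 450.0) for t in times]
n0, tau = fit_decay_time(times, synthetic_counts)
```

For noisy averaged counts a weighted or nonlinear fit may be preferable, since the logarithm amplifies the noise of small counts at late times.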

FIG. 2: Number of hydrogen bonds between the groups CYS1-CYS3, CYS2-CYS4 and TAA1-TAA3 up to 1 ns (top) and 10 ns (bottom), averaged over 5 simulations. Typical configurations are shown as snapshots. The initial vanishing of hydrogen bonds between the CYS2-CYS4 groups can be roughly approximated by an exponential fit with τ ≈ 450 ps, and with τ ≈ 130 ps for the CYS1-CYS3 groups.

To investigate the fully accessible phase space, we applied the metadynamics technique, in which a history-dependent biasing potential is applied to the molecule and helps to overcome energetic barriers. Details of the method can be found elsewhere. We calculated the distance matrices for the nucleobases of the i-motif and of further conformations occurring at later simulation times, presented in Fig. 3. The i-motif (Panel a) represents a well-defined structure with many local interactions, even over long distances along the backbone. Two further structures (Panels b and c) differ in their nearest neighbor interactions. The structure of (b) is also shown in Fig. 2 after 7 ns, and the structure of (c) is a fully planar hairpin structure which indicates cross-like interactions. Local interactions for all nucleobases between C1 and A12 can be observed in (b), whereas (c) indicates interactions between the opposite sides of the strand. These results are in good agreement with Fig. 2, where it is shown that the opening of the i-motif is initiated at the 3' end. In addition, we have also observed a fully extended configuration in our simulations (Panel d). This conformation is characterized by the absence of the cross-like structure shown in Panel (c). It can be concluded that only local nearest neighbor interactions along the backbone are present, with short distances between the nucleobases. Hence the vanishing of the i-motif can be easily rationalized in terms of distance matrices, which show a decrease in the connectivity of the nucleobases for longer simulation times.

FIG. 3: Distance matrices for the 22 nucleobases of the i-motif and three other conformations. Panel a) shows the initial i-motif. Panels b) and c) present the results for two hairpin structures and Panel d) is related to a fully extended strand.
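A distance matrix of this kind, and a simple measure of the connectivity it encodes, can be sketched in Python as follows. The sketch assumes that one representative position per nucleobase (for example its center of mass) is already available; the cutoff and the minimum sequence separation are hypothetical parameters chosen for illustration, not values from the analysis above.

```python
import math


def distance_matrix(coords):
    """Symmetric matrix of pairwise distances between nucleobase positions."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contact_count(matrix, cutoff, min_separation=2):
    """Count pairs i < j that are at least min_separation apart along the
    backbone and closer in space than cutoff."""
    n = len(matrix)
    count = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if matrix[i][j] < cutoff:
                count += 1
    return count


# Toy example: three positions on a line (in nm).
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
toy_matrix = distance_matrix(toy_coords)
toy_contacts = contact_count(toy_matrix, cutoff=1.5, min_separation=1)
```

A compact fold such as the i-motif yields many contacts between bases that are far apart in sequence, whereas an extended strand retains only near-diagonal entries, so the contact count drops as the connectivity decreases.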



Free energy landscapes 

The study of stable structures is achieved by the investigation of the free energy landscape. Generally, the calculation of free energy landscapes is a challenging task because of problems like the finite simulation time, which results in low sampling efficiency for several regions, as well as the neglect of structural transitions representing rare events [18, 19]. Nevertheless, many computational algorithms have been published over the years to overcome these problems.

Metadynamics allows the calculation of the free energy landscape after filling the free energy minima with an additional biasing energy in the form of Gaussian hills. The Gaussians are set every τ relaxation steps and are used to overcome energetic barriers such that the whole phase space becomes accessible. As the first collective variable we chose the distance between the centers of mass of A11 and of the combined C1 and T22 nucleobases. As a second reaction coordinate we chose the distance between the centers of mass of the C1 and the T22 groups. We chose these observables because of the large variations in the distance matrices shown in Fig. 3. The results for the free energy landscape are presented in Fig. 4.

FIG. 4: Free energy landscape for the distance r_C1/T22-A11 between the center of mass of the combined C1 and T22 nucleobases and A11, and the distance r_C1-T22 between C1 and T22 (top). The lines correspond to energy differences of 1.5 kcal/mol.

Two large minima can be identified in a funnel-like landscape [24], with free energy differences to the native structure of around -8 kcal/mol. These results indicate that they are very stable in contrast to the i-motif. A closer look at these conformations shows that they belong to the planar and partly planar structures shown in Fig. 3 (Panels b and c). The fully extended strand, which was also observed in our biased simulations, can be identified as energetically less favourable. The overlap of the two minima in each direction shows that a separate calculation of the free energy for each coordinate would significantly lower, and introduce errors in, the barriers and the minima [25]. Hence only a two-dimensional representation leads to a sufficient estimate of the free energy differences.
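The loss of information in a one-dimensional representation can be illustrated with a small Python sketch that integrates one coordinate out of a two-dimensional free energy grid. The grid is a hypothetical toy landscape with two minima of -8 kcal/mol that share the same value of the first coordinate; it is not the landscape of Fig. 4.

```python
import math

KT_300 = 0.0019872 * 300.0  # thermal energy at 300 K in kcal/mol


def marginal_free_energy(grid, axis, kT=KT_300):
    """Free energy along one coordinate of a 2D grid (list of rows).

    axis=0 keeps the row index and integrates out the column index,
    axis=1 does the opposite: F(x) = -kT ln sum_y exp(-F(x, y) / kT).
    """
    slices = grid if axis == 0 else list(zip(*grid))
    return [-kT * math.log(sum(math.exp(-f / kT) for f in s)) for s in slices]


# Toy landscape in kcal/mol: rows are coordinate 1, columns are coordinate 2.
toy_grid = [
    [0.0, 0.0, 0.0],
    [-8.0, 0.0, -8.0],
    [0.0, 0.0, 0.0],
]
along_first = marginal_free_energy(toy_grid, axis=0)
along_second = marginal_free_energy(toy_grid, axis=1)
```

Along the first coordinate the two minima merge into a single well and the barrier between them disappears, while along the second coordinate both wells and a barrier of more than 7 kcal/mol remain visible. This is the kind of distortion the two-dimensional representation avoids.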
We have also calculated the free energy landscape for the essential eigenvectors of the system [26]. Eigenvectors have been shown to be useful for capturing the main concerted motion of the molecule [26]. Mathematically they are closely related to principal component analysis. The main motion is described by the first eigenvectors, which form the essential subspace, whereas the fluctuating motion of the higher eigenvectors represents the remaining subspace. A detailed description can be found in the literature. To calculate the eigenvectors for the complete unfolding trajectories into the stretched structure, we have performed five unbiased high temperature simulations at 500 K, each of 10 ns. The simulations have been combined to calculate an unbiased high temperature averaged eigenvector set. This set was used for the projection of the free energy landscape onto the essential subspace of the first eigenvectors [19, 26] in the low temperature regime at 300 K. A recent publication [27] has validated this technique.

FIG. 5: Free energy landscape for the eigenvectors 1 and 2 with the corresponding stable configurations (top) and unfolding pathways (bottom). The unfolding pathways are shown with the corresponding free energy differences ΔF and free energy barriers Δ, following the notation in the top figure. Transitions without given Δ values indicate the disappearance of barriers between the configurations.
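The idea of an essential eigenvector and of projecting configurations onto it can be sketched in plain Python. The sketch builds the covariance matrix of a set of coordinate frames and extracts its leading eigenvector by power iteration; this is a generic principal component calculation under simplifying assumptions (no mass weighting, no removal of overall rotation and translation), not the procedure of Refs. [19, 26]. The frames are synthetic and lie on a single line.

```python
import math


def mean_and_covariance(frames):
    """Mean structure and covariance matrix of the coordinate fluctuations."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(dim)]
    cov = [[sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in frames) / (n - 1)
            for j in range(dim)] for i in range(dim)]
    return mean, cov


def leading_eigenvector(cov, iterations=200):
    """Eigenvector with the largest eigenvalue, obtained by power iteration.

    The uniform start vector is an assumption; it fails only if it is
    exactly orthogonal to the leading eigenvector.
    """
    dim = len(cov)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(frame, mean, eigenvector):
    """Coordinate of a frame along an eigenvector, relative to the mean."""
    return sum((x - m) * e for x, m, e in zip(frame, mean, eigenvector))


# Synthetic "high temperature" frames moving along the direction (1, 2, 0).
frames = [(t, 2.0 * t, 0.0) for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
mean, cov = mean_and_covariance(frames)
ev1 = leading_eigenvector(cov)
coordinate = project((2.0, 4.0, 0.0), mean, ev1)
```

Once such an eigenvector set is fixed, frames from any other trajectory, including low temperature ones, can be projected onto it, which mirrors the use of the high temperature eigenvectors for the 300 K landscape.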

The final two-dimensional energy landscape with the observed unfolding pathways is presented in Fig. 5. Two large minima with free energies around ΔF ≈ -8 kcal/mol, belonging to conformations (b) and (c), can be identified. In contrast, a local energetic minimum associated with a stretched structure (e) with a free energy difference ΔF ≈ -5 kcal/mol constitutes a metastable configuration. These values are in good agreement with the results shown in Fig. 4. By analyzing the biased metadynamics pathway and the corresponding high temperature unfolding simulations [27], we are able to identify the main unfolding pathways, as presented in the bottom of Fig. 5.

Pathway 1 follows the spontaneous unfolding into hairpin structure (b) with a free energy difference of -8 kcal/mol without notable energy barriers. From structure (b), one option is to unfold into structure (c) with a barrier of Δ = 7.5 kcal/mol, or to transform directly into structure (d) with an identical barrier height. Having unfolded into structure (d), the s-shaped form can transform into a fully extended strand (e) with a free energy difference of -5 kcal/mol by overcoming a free energy barrier of 3 kcal/mol. Structural transformations into (d) and (e) are largely hindered energetically, such that the hairpin structures can be concluded to represent the stable conformations at 300 K.
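To give a feeling for what these free energy differences and barriers mean at 300 K, the following Python sketch evaluates simple Boltzmann factors. It assumes a two-state equilibrium picture and an Arrhenius-like barrier factor without any prefactor, which is a rough illustration rather than a kinetic model of the unfolding.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)


def population_ratio(delta_f, temperature):
    """Equilibrium population of a state relative to the reference state,
    exp(-dF / RT), for a free energy difference dF in kcal/mol."""
    return math.exp(-delta_f / (R_KCAL * temperature))


def barrier_factor(barrier, temperature):
    """Boltzmann factor exp(-barrier / RT) that suppresses barrier crossing."""
    return math.exp(-barrier / (R_KCAL * temperature))


hairpin_vs_imotif = population_ratio(-8.0, 300.0)   # structure (b), dF = -8 kcal/mol
extended_vs_imotif = population_ratio(-5.0, 300.0)  # structure (e), dF = -5 kcal/mol
leave_hairpin = barrier_factor(7.5, 300.0)          # barrier out of (b)
d_to_e = barrier_factor(3.0, 300.0)                 # barrier from (d) to (e)
```

With these numbers the hairpin is favoured over the i-motif by a factor between 10^5 and 10^6, and the 7.5 kcal/mol barrier out of the hairpin suppresses crossing far more strongly than the 3 kcal/mol barrier between (d) and (e).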

The other pathway is realized by a torsional motion into structure (g), which is energetically separated by 1.5 kcal/mol from the i-motif. The transition from structure (g) into (f) is hindered by an energy barrier of roughly Δ ≈ 1.5 kcal/mol, after which the strand freely unfolds into structure (d) to join the first pathway into the fully extended strand. Combining all results, it can finally be concluded that the global minima given by the hairpin configurations are energetically more favourable than the other structures. Additionally, these conformations are stabilized by large energy barriers, which are in the range of the free energy differences.

The one-dimensional representation for each eigenvector, calculated by a projection scheme [19], is presented on the left side of Fig. 6, whereas the corresponding concerted motion is shown on the right side. Eigenvector 1 describes the variation of the end-to-end distance by a stretching motion, whereas eigenvector 2 mainly captures the relaxation from the i-motif to the planar structure.

The deepest minimum along eigenvector 1 can be found for the hairpin configurations, where the values are in good agreement with the results derived in Fig. 5. The averaged barrier height between the hairpin structures and the stretched structures is above 6 kcal/mol. For a detailed investigation of the two hairpin structures we have calculated the average values for eigenvector 2 along a window of eigenvector 1 between -20 nm and -10 nm, resulting in an averaged barrier height of around 5 kcal/mol. Hence, taking all results together, the hairpin structures represent the energetically most stable configurations.



Thermodynamic properties 

For a closer view on the unfolding mechanisms, we have calculated the total energy and the temperature-entropy contribution to the free energy for each configuration shown in Fig. 5 relative to the i-motif (Tab. I). The total entropy is given by the relation ΔS = (ΔU - ΔF)/T, where ΔF and ΔU denote the free and the total energy differences and T represents the temperature of the system. The energetically most favourable conformations are structures (d), (f) and (g), which represent the second pathway. However, the large deficit of entropy compared to the other structures marks these conformations, and the second pathway in general, as unfavorable, in agreement with the results shown in Fig. 5. It is surprising that all other structures are entropically less favourable than the i-motif. This can be related to structural changes in the surrounding water shell, as we will show in the following.

FIG. 6: Free energy landscape for the one-dimensional representation along eigenvector 1 (top left) and eigenvector 2 (bottom left) with the corresponding conformations. The values for eigenvector 2 have been computed along the values -20 nm to -10 nm of eigenvector 1 to compute the average barrier between the hairpin structures. The concerted motion of eigenvectors 1 and 2 is shown on the right, as indicated by the red arrows.
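The relation between the three quantities can be checked directly against the values of Tab. I with a few lines of Python. The numbers are copied from the table (in kcal/mol, relative to the i-motif at 300 K); the helper names are illustrative.

```python
# Structure: (dU, T*dS, dF) in kcal/mol at 300 K, as listed in Tab. I.
TABLE_I = {
    "b": (-59, -51, -8),
    "c": (-21, -14, -7),
    "d": (-127, -124, -3),
    "e": (-62, -57, -5),
    "f": (-140, -139, -1),
    "g": (-177, -176, -1),
}


def temperature_entropy_term(delta_u, delta_f):
    """T*dS obtained from dF = dU - T*dS."""
    return delta_u - delta_f


def entropy_difference(delta_u, delta_f, temperature):
    """dS = (dU - dF) / T in kcal/(K mol)."""
    return (delta_u - delta_f) / temperature


recomputed = {k: temperature_entropy_term(u, f) for k, (u, _, f) in TABLE_I.items()}
delta_s_b = entropy_difference(-59, -8, 300.0)
```

The recomputed T ΔS values coincide with the tabulated ones, and the entropy difference obtained for structure (b) corresponds to about -0.17 kcal/(K mol).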

For this we have calculated the entropic contributions separately via the relation ΔS = ΔS_e + ΔS_c, where ΔS_e represents the change in the environmental entropy and ΔS_c the change in the configurational entropy calculated by a quasi-harmonic approach [28]. The results are shown in Tab. II. Both pathways are comparable in their configurational entropy gain. However, a significant contribution to the total entropy arises from local environment changes ΔS_e. All values are negative, leading to the conclusion that the water around the initial i-motif has a larger entropy. The largest deviations occur for structures (d), (f) and (g) belonging to the second pathway. This also supports the conclusion that the second pathway is entropically less preferable.
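The decomposition can be verified against the 300 K values of Tab. II with a short Python sketch, which also ranks the structures by their environmental entropy deficit. The numbers are copied from the table; the helper names are illustrative.

```python
# Structure: (dS, dS_c, dS_e, dn_H); entropies in kcal/(K mol), as listed in Tab. II.
TABLE_II = {
    "b": (-0.17, 0.35, -0.52, 11),
    "c": (-0.05, 0.34, -0.39, 1),
    "d": (-0.41, 0.34, -0.75, 20),
    "e": (-0.19, 0.35, -0.54, 31),
    "f": (-0.46, 0.34, -0.80, 25),
    "g": (-0.59, 0.33, -0.92, 37),
}


def decomposition_residual(row):
    """dS - (dS_c + dS_e); zero if the decomposition is exact."""
    total, configurational, environmental, _ = row
    return total - (configurational + environmental)


def rank_by_environmental_deficit(table):
    """Structure labels ordered from the most to the least negative dS_e."""
    return sorted(table, key=lambda label: table[label][2])


residuals = {k: decomposition_residual(v) for k, v in TABLE_II.items()}
ranking = rank_by_environmental_deficit(TABLE_II)
```

All residuals vanish within rounding, and the three structures with the largest environmental deficit are (g), (f) and (d), the members of the second pathway.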

The variation of ΔS_e and of the total energy difference ΔU can be rationalized by regarding the number of hydrogen bonds relative to the i-motif, Δn_H. The values of Tab. II show that a large number of hydrogen bonds between the solvent and the DNA leads to a significant deficit of entropy due to local ordering and to a more negative total energy. It is significant that the hairpin structures (b) and (c) show the smallest increase of Δn_H. This can be explained by the fact that most of the nucleobases belonging to these conformations are not accessible to water molecules, which avoids additional hydrogen bonds. Structures like (d), (g) or (f) are more hydrated, as can be seen from the increased number of hydrogen bonds, which results in larger deficits of the environmental entropy. Thus the main unfolding pathway and the stability of the hairpin conformations can be mainly explained by entropically more preferable structures due to a diminished number of hydrogen bonds compared to other configurations.

TABLE I: Total energy ΔU and temperature-entropy contribution TΔS to the free energy difference ΔF for the different configurations shown in Fig. 5 relative to the i-motif structure.

      ΔU [kcal/mol]   TΔS [kcal/mol]   ΔF [kcal/mol]
  b       -59             -51              -8
  c       -21             -14              -7
  d      -127            -124              -3
  e       -62             -57              -5
  f      -140            -139              -1
  g      -177            -176              -1

TABLE II: Total entropy ΔS, configurational entropy ΔS_c, environment entropy ΔS_e and number of hydrogen bonds Δn_H for the different configurations shown in Fig. 5 relative to the i-motif structure.

      ΔS [kcal/(K mol)]   ΔS_c [kcal/(K mol)]   ΔS_e [kcal/(K mol)]   Δn_H
  b       -0.17                0.35                 -0.52              11
  c       -0.05                0.34                 -0.39               1
  d       -0.41                0.34                 -0.75              20
  e       -0.19                0.35                 -0.54              31
  f       -0.46                0.34                 -0.80              25
  g       -0.59                0.33                 -0.92              37

TABLE III: Total energy ΔU and temperature-entropy contribution TΔS to the free energy difference ΔF for the different configurations shown in Fig. 5 relative to the i-motif structure at 500 K.

      ΔU [kcal/mol]   TΔS [kcal/mol]   ΔF [kcal/mol]
  b       -31             -29              -2
  c        -7              -4              -3
  d       -48             -45              -3
  e       -10              -7              -3
  f       -47             -45              -2
  g      -108            -106              -2

TABLE IV: Total entropy ΔS, configurational entropy ΔS_c, environment entropy ΔS_e and number of hydrogen bonds Δn_H for the different configurations shown in Fig. 5 relative to the i-motif structure at 500 K.

      ΔS [kcal/(K mol)]   ΔS_c [kcal/(K mol)]   ΔS_e [kcal/(K mol)]   Δn_H
  b       -0.05                0.44                 -0.49              11
  c       -0.01                0.43                 -0.44               1
  d       -0.09                0.44                 -0.53              20
  e       -0.01                0.45                 -0.46              23
  f       -0.09                0.43                 -0.52              17
  g       -0.21                0.43                 -0.61              26

To understand the drastic variation of the free energy landscape at higher temperatures reported in Ref. [27], we have analyzed the thermodynamic properties at 500 K. It has been shown [27] that at these temperatures the hairpin structures are only metastable configurations and an extended structure similar to (e) represents the global equilibrium conformation. To study these properties in detail, we repeated the thermodynamic calculations with identical structures at a temperature of 500 K. The values for the total energy, the free energy differences and the temperature-entropy contribution at 500 K are shown in Tab. III. Nearly all values are smaller in magnitude than those in Tab. I. This can be explained by a decreased number of hydrogen bonds (Tab. IV) which interact with the i-motif, due to an increased thermal energy resulting in smaller Bjerrum lengths [27]. Additionally it can be assumed that the dependence of the entropy on the number of hydrogen bonds is less pronounced at 500 K than at 300 K. Hence the values for the total energy and the temperature-entropy contribution are lowered, which leads to a change of the free energy landscape, as can be seen by comparing the values of ΔF in Tab. III with those in Tab. I.
To investigate this in more detail we have calculated the entropic contributions for each configuration at 500 K. The results are presented in Tab. IV. All entropy values are smaller except the configurational entropy difference ΔS_c, which increases due to the thermal activation of further degrees of freedom at elevated temperatures. Comparing Δn_H and ΔS_e indicates that the differences between the configurations are less pronounced than at lower temperatures (Tab. II), as discussed above. This can be related, first, to a weaker dependence of ΔS_e on Δn_H and, second, to a general decrease of Δn_H for certain structures. Hence the general change of the entropy at higher temperatures mentioned above can be mainly understood by a variation of ΔS_e alone. It can be seen that the dependence of the environmental entropy on the hydrogen bonds is less important at higher temperatures. In addition, due to nearly identical entropy values for each structure, the contributions of the total energy become more important, which results in the small variations of the free energy differences shown in Tab. III. Hence our results show that the variation of the free energy landscape at higher temperatures is caused by a temperature-dependent entropy, which is largely influenced by local hydrogen bonds with the solvent.

FIG. 7: Ratio of the backbone and non-backbone hydrogen bonds n_H compared to the i-motif n_H^i-motif for each configuration at 300 K. Inset: Ratio of the lifetimes of the backbone hydrogen bonds τ_bb and the non-backbone hydrogen bonds τ_nb for each configuration at 300 K. The lines are only guides for the eye.

FIG. 8: Ratio of the number of hydrogen bonds n_H,500K/n_H,300K at 500 K and 300 K for non-backbone and backbone atoms. The lines are only guides for the eye.
To study the properties and the characteristics of Δn_H for different situations and temperatures, we have calculated the ratio of hydrogen bonds compared to the i-motif for backbone and non-backbone hydrogen bonds individually. The results for 300 K are presented in Fig. 7. The backbone ratio is nearly constant for each structure. A significant increase of non-backbone hydrogen bonds coming from nucleobase-water interactions can be observed for all structures except (c). In addition, several structures show an increase of the accessible hydrophilic surface area of 7 (b), 11 (d) and 6 (f, g) nm^2 compared to the i-motif, which also supports the results shown in Fig. 7. Additionally it can be assumed that the hydrogen bonds of the backbone with the water are energetically more stable due to larger electrostatic interactions. This becomes obvious by regarding the lifetimes of the hydrogen bonds for the backbone, τ_bb, and for non-backbone atoms, τ_nb. The ratio τ_bb/τ_nb is shown in the inset of Fig. 7. This behaviour can be explained by the more highly charged backbone atoms in comparison to the less charged nucleobase atoms, which leads to energetically more stable bonds.

As discussed above for 300 K, the variation of the environmental entropy for each structure is mainly induced by additionally accessible nucleobase atoms forming hydrogen bonds with the solvent. Their properties at higher temperatures are presented in Fig. 8, where the ratio n_H,500K/n_H,300K of the number of hydrogen bonds at 500 K and 300 K is shown. Lower ratios are found for non-backbone hydrogen bonds, indicating weaker interactions. By comparison with the results for Δn_H shown in Tabs. II and IV, it can be seen that the deficit of Δn_H at higher temperatures is mainly given by a decrease of non-backbone hydrogen bonds, resulting in the variation of the environmental entropy.
By combining all results, it becomes obvious that the environmental entropy of certain unfolded conformations is significantly lowered compared to the i-motif due to additionally accessible non-backbone hydrogen bonds. These bonds are energetically less stable than the backbone hydrogen bonds and are therefore diminished at higher temperatures. This leads to smaller variations of the environmental entropy compared to lower temperatures, which results in different free energy landscapes, as observed in Ref. [27].

Experimental results



The stability of the hairpin structures at 300 K and the 
appearance of the extended structures at higher tempera- 
tures is also supported by regarding the results of circular 
dichroism (CD) spectropolarimetry. The CD spectra at 
two different temperatures 298.15 K and 368.15 K are 
presented in Fig. [9l It has been reported in [29, 30] that 
the DNA i-motif has a maximum at 285 nm, a negative 
minimum at 260 nm and a crossover at 270 nm. Hence 
the results shown in Fig. [9] display an absence of the i- 
motif which is indicated by a shifted maximum at 273 nm, 
a minimum at 250.5 nm and a crossover at 257 nm. In 
pioneering studies [3lJ it has been discussed that a max- 
imum at 270 nm corresponds to a random-coil structure 
which could be also brought into accordance to structures 
(b) and (c) of Fig. [5] Related to this, a further study has 
also assumed the existence of stable hairpin structures at 
pH values of 7.2 based on their experimental CD results 

It has been generally validated H, 0] that the intensity of
the minima is given by the number of connected base pairs
and the intramolecular interaction energies. As can be seen
in Fig. [9], the magnitude of the minimum and the maximum
depends on the applied conditions, which indicates a
decrease of intramolecular interactions at higher
temperatures. This is in agreement with a recent publication
[27] and with the results of Fig. [3] and the discussion
above, where it has been shown that the globally stable
conformation at higher temperatures is the fully extended
strand, for which intramolecular interaction energies are
largely negligible. The change of the global minimum can be
explained by a temperature-dependent entropy which favours
the extended conformation, as validated by the previous
results.
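
The entropic argument can be illustrated with a simple two-state estimate, dF(T) = dU - T dS, for the extended strand relative to the hairpin. The sketch below is an assumption-laden illustration and not the free energy calculation of this work: it treats dU and dS as constants, which ignores the temperature dependence of the entropy discussed above, and the numerical values are hypothetical placeholders chosen only to show a sign change between the two measurement temperatures.

```python
def free_energy_difference(delta_u, delta_s, temperature):
    """Two-state estimate dF = dU - T * dS (extended minus hairpin)."""
    return delta_u - temperature * delta_s

def crossover_temperature(delta_u, delta_s):
    """Temperature at which dF changes sign, assuming constant dU and dS."""
    if delta_s <= 0:
        raise ValueError("a crossover requires a positive entropy gain")
    return delta_u / delta_s

# Hypothetical placeholder values (not taken from the simulations):
# energy cost of breaking intramolecular contacts, in kJ/mol,
# and entropy gain of the extended strand, in kJ/(mol K).
DELTA_U = 33.0
DELTA_S = 0.1

for temperature in (298.15, 368.15):
    d_f = free_energy_difference(DELTA_U, DELTA_S, temperature)
    favoured = "hairpin" if d_f > 0 else "extended strand"
    print(f"T = {temperature} K: dF = {d_f:.3f} kJ/mol, favoured: {favoured}")

print(f"crossover near {crossover_temperature(DELTA_U, DELTA_S):.1f} K")
```

With these placeholder numbers the hairpin is favoured at the lower temperature and the extended strand at the higher one, which is the qualitative behaviour described in the text.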
Base stacking energies have also been reported [Hj to be
responsible for larger magnitudes at 255 nm, specifically
for mismatched hairpin configurations. Additionally, it was
found that C-C homomismatches are energetically more
favourable than heteromismatches, which explains this
unusual stability [Hj.
The observed CD spectrum can be brought into accordance
with a hairpin B-DNA structure [32], and it can also be
assumed that the T-A pairing, which has been validated by
the results of Fig. [2] and Fig. [3], is mainly responsible
for the stability of these special structures. Furthermore,
the absence of the i-motif is also indicated by the negative
minima of the CD spectra around 297 nm, which belong to
deprotonated cytosines [35]. Hence it can be concluded that
under basic conditions the DNA i-motif is absent and hairpin
structures represent the global minima. Our experiments have
shown that these configurations transform into a fully
extended strand at elevated temperatures, in good agreement
with the numerical results.
FIG. 9: CD spectra at 298.15 K (black) and 368.15 K at
pH 8.0. The small numbers denote the positions of the
minima and maxima.



SUMMARY AND CONCLUSION 

We have simulated the unfolding of an unprotonated DNA
i-motif at 300 K via Molecular Dynamics simulations in
explicit solvent. Our results identify the planar and partly
planar configurations as the most stable structures in the
absence of protonated cytosines. These structures are
realized by an individual opening of each end of the strand.
The vanishing of the intramolecular hydrogen bonds which are
responsible for the formation of the DNA i-motif occurs on a
time scale which is significantly shorter than 10 ns. This
confirms that the deprotonated i-motif is not stable at room
temperature, in agreement with earlier published
experimental results [1, E| . A stable partly planar
configuration appears on a time scale of 7 ns.

The calculation of the essential dynamics in combination
with the corresponding free energy landscape allows the
unfolding pathways to be determined. Two main unfolding
mechanisms can be identified, of which the preferential one
is entropically more favourable, leading to lower free
energies. We have shown that large contributions to the
entropy values arise from local water entropy deficits which
are caused by hydrogen bonds. The nature of these bonds and
their importance for a temperature-dependent entropy have
been investigated in detail. Hence we have shown that the
preferential unfolding process is mainly determined by
entropic contributions dominated by lower DNA-solvent
ordering interactions.

The numerical results are in good agreement with the
experimental CD spectra. It can be shown that the
temperature dependence of the experimentally observed
results can be explained by a vanishing of hairpin
structures at higher temperatures due to a variation of the
free energy landscape. By analyzing the spectral data, it
becomes evident that base stacking energies largely dominate
intramolecular interactions at lower temperatures, which
also validates the presence of hairpin structures.

Thus our results help to improve the usage of i-motifs in
nanomachines through the estimation of the unfolding pathway
and its stable configurations. As discussed above, we have
validated that a full unfolding into a stretched structure
is not energetically favourable and can only be realized by
thermal or energetic activation. This sheds new light on the
importance of the unfolding pathway and its stable
configurations. It was assumed in Refs. [9], [14] that the
unfolding process of an i-motif leads to a fully extended
conformation, which is in contrast to our results. Although
the presence of a large number of grafted i-motifs may lead
to different unfolding pathways due to interchain
interactions, our results allow an optimization for low
grafting densities, which are needed for the fabrication of
nanocontainers [10]. We are currently investigating these
properties in more detail.

Moreover, in a biological context our results can be
adopted for a further study of possible i-motif
configurations in the human cell. As discussed in the
introduction, this may be important from a pharmaceutical
point of view due to the fact that the upstream region of
the i-motif along the insulin gene is related to its
conformation and therefore facilitates transcription Q.
Hence we hope that our results help to achieve a deeper
insight into the mechanisms of i-motif formation in vivo
and in vitro.